NSF PAR Search | NSF Public Access Repository

A Structure-Aware Framework for Learning Device Placements on Computation Graphs

Duan, Shukai; Ping, Heng; Kanakaris, Nikos; Xiao, Xiongye; Kyriakis, Panagiotis; Ahmed, Nesreen K; Zhang, Peiyu; Ma, Guixiang; Capotă, Mihai; Nazarian, Shahin; et al (December 2024, NeurIPS)

Computation graphs are Directed Acyclic Graphs (DAGs) where the nodes correspond to mathematical operations and are used widely as abstractions in optimizations of neural networks. The device placement problem aims to identify optimal allocations of those nodes to a set of (potentially heterogeneous) devices. Existing approaches rely on two types of architectures known as grouper-placer and encoder-placer, respectively. In this work, we bridge the gap between encoder-placer and grouper-placer techniques and propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into account the DAG nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and jointed, personalized graph partitioning, using an unspecified number of groups. To train the entire framework, we use reinforcement learning with the execution time of the placement as the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models both over CPU execution and over other commonly used baselines.
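
To illustrate the idea of scoring a placement by its execution time, here is a minimal sketch in Python. It is not the framework described in the abstract: the four-node graph, the device names, the per-node costs, and the transfer cost are all hypothetical, and an exhaustive search stands in for the learned placement policy. The only part taken from the abstract is the reward signal, defined as the negative of the (here simulated) execution time.

```python
from itertools import product

# Hypothetical toy computation graph: node -> list of predecessor nodes (a DAG).
PREDS = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
ORDER = ["a", "b", "c", "d"]  # one topological order of the DAG

# Hypothetical compute cost (ms) of each node on each device.
COST = {
    "cpu": {"a": 4.0, "b": 6.0, "c": 3.0, "d": 4.0},
    "gpu": {"a": 2.0, "b": 2.0, "c": 5.0, "d": 1.0},
}
TRANSFER = 1.5  # hypothetical cost (ms) of moving a tensor between devices


def execution_time(placement):
    """Simulate the makespan of a placement (dict: node -> device).

    Each device runs one node at a time, in topological order. A node
    starts once its device is free and all of its inputs have arrived.
    """
    finish = {}                                # node -> finish time
    device_free = {dev: 0.0 for dev in COST}   # device -> time it becomes idle
    for node in ORDER:
        dev = placement[node]
        ready = 0.0
        for pred in PREDS[node]:
            # Inputs produced on another device pay the transfer cost.
            hop = TRANSFER if placement[pred] != dev else 0.0
            ready = max(ready, finish[pred] + hop)
        start = max(ready, device_free[dev])
        finish[node] = start + COST[dev][node]
        device_free[dev] = finish[node]
    return max(finish.values())


def reward(placement):
    """Reward signal: shorter execution time means higher reward."""
    return -execution_time(placement)


def best_placement():
    """Exhaustive search over all placements (a stand-in for a learned policy)."""
    devices = list(COST)
    candidates = (
        dict(zip(ORDER, combo)) for combo in product(devices, repeat=len(ORDER))
    )
    return max(candidates, key=reward)


if __name__ == "__main__":
    placement = best_placement()
    print(placement, execution_time(placement))
```

In this toy setting the best placement mixes devices, because node `c` is cheaper on the hypothetical CPU and can overlap with `b` on the GPU. Exhaustive search only works for tiny graphs; the number of placements grows exponentially with the number of nodes, which is why a learned policy is attractive for real models.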
Full Text Available
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks

https://doi.org/10.1109/ICASSP48485.2024.10447377

Cheng, Anzhe; Ping, Heng; Wang, Zhenkun; Xiao, Xiongye; Yin, Chenzhong; Nazarian, Shahin; Cheng, Mingxi; Bogdan, Paul (April 2024, IEEE)
Ko, Hanseok (Ed.)
Backpropagation (BP) has been a successful optimization technique for deep learning models. However, its limitations, such as backward- and update-locking, and its biological implausibility, hinder the concurrent updating of layers and do not mimic the local learning processes observed in the human brain. To address these issues, recent research has suggested using local error signals to asynchronously train network blocks. However, this approach often involves extensive trial-and-error iterations to determine the best configuration for local training. This includes decisions on how to decouple network blocks and which auxiliary networks to use for each block. In our work, we introduce a novel BP-free approach: a block-wise BP-free (BWBPF) neural network that leverages local error signals to optimize distinct sub-neural networks separately, where the global loss is only responsible for updating the output layer. The local error signals used in the BP-free model can be computed in parallel, enabling a potential speed-up in the weight update process through parallel implementation. Our experimental results consistently show that this approach can identify transferable decoupled architectures for VGG and ResNet variations, outperforming models trained with end-to-end backpropagation and other state-of-the-art block-wise learning techniques on datasets such as CIFAR-10 and Tiny-ImageNet. The code is released at https://github.com/Belis0811/BWBPF.
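
The separation between local and global error signals can be sketched with a deliberately tiny scalar model in Python. This is not the BWBPF code from the linked repository: the single hidden "block", its auxiliary head, the data, and the learning rate are all assumptions chosen for illustration. What it shows is the update rule described in the abstract: the block and its auxiliary head are updated from a local loss only, and the global loss updates only the output layer.

```python
# Hypothetical training data for the target function y = 3 * x.
DATA = [(-1.0, -3.0), (-0.5, -1.5), (0.5, 1.5), (1.0, 3.0)]


def global_loss(w1, w2):
    """Mean squared error of the full model: output = w2 * (w1 * x)."""
    return sum((w2 * w1 * x - y) ** 2 for x, y in DATA) / len(DATA)


def train(epochs=200, lr=0.05):
    """Train one block with a local error signal and an output layer with the global loss.

    w1: weight of the hidden block          (updated by the local loss only)
    a1: weight of its auxiliary head        (updated by the local loss only)
    w2: weight of the output layer          (updated by the global loss only)
    """
    w1, a1, w2 = 0.5, 0.5, 0.5
    for _ in range(epochs):
        for x, y in DATA:
            h = w1 * x  # block activation, treated as a constant by the output layer

            # Local error signal: the auxiliary head predicts y directly from h.
            local_err = a1 * h - y
            grad_w1 = 2 * local_err * a1 * x
            grad_a1 = 2 * local_err * h

            # Global error signal: no gradient flows back from here into w1.
            global_err = w2 * h - y
            grad_w2 = 2 * global_err * h

            # The two groups of gradients are independent of each other, so in a
            # real implementation they could be computed in parallel.
            w1 -= lr * grad_w1
            a1 -= lr * grad_a1
            w2 -= lr * grad_w2
    return w1, a1, w2


if __name__ == "__main__":
    before = global_loss(0.5, 0.5)
    w1, a1, w2 = train()
    print(before, global_loss(w1, w2))
```

Note that `grad_w1` never depends on `w2`, so the block does not wait for a backward pass through the output layer. That independence is what removes backward locking in this toy example.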
Full Text Available
Ultrathin GaN Crystal Realized Through Nitrogen Substitution of Layered GaS

https://doi.org/10.1007/s11664-023-10670-w

Cao, Jun; Li, Tianshu; Gao, Hongze; Cong, Xin; Lin, Miao-Ling; Russo, Nicholas; Luo, Weijun; Ding, Siyuan; Wang, Zifan; Smith, Kevin E.; et al (November 2023, Journal of Electronic Materials)

Full Text Available
Phonon renormalization in reconstructed MoS2 moiré superlattices

https://doi.org/10.1038/s41563-021-00960-1

Quan, Jiamin; Linhart, Lukas; Lin, Miao-Ling; Lee, Daehun; Zhu, Jihang; Wang, Chun-Yuan; Hsu, Wei-Ting; Choi, Junho; Embley, Jacob; Young, Carter; et al (January 2021, Nature Materials)
In moiré crystals formed by stacking van der Waals materials, surprisingly diverse correlated electronic phases and optical properties can be realized by a subtle change in the twist angle. Here, we discover that phonon spectra are also renormalized in MoS2 twisted bilayers, adding an insight to moiré physics. Over a range of small twist angles, the phonon spectra evolve rapidly owing to ultra-strong coupling between different phonon modes and atomic reconstructions of the moiré pattern. We develop a low-energy continuum model for phonons that overcomes the outstanding challenge of calculating the properties of large moiré supercells and successfully captures the essential experimental observations. Remarkably, simple optical spectroscopy experiments can provide information on strain and lattice distortions in moiré crystals with nanometre-size supercells. The model promotes a comprehensive and unified understanding of the structural, optical and electronic properties of moiré superlattices.
Full Text Available
Deep learning to quantify the pace of brain aging in relation to neurocognitive changes

https://doi.org/10.1073/pnas.2413442122

Yin, Chenzhong; Imms, Phoebe; Chowdhury, Nahian F; Chaudhari, Nikhil N; Ping, Heng; Wang, Haoqing; Bogdan, Paul; Irimia, Andrei; Weiner, Michael; Aisen, Paul; et al (March 2025, Proceedings of the National Academy of Sciences)

Brain age (BA), distinct from chronological age (CA), can be estimated from MRIs to evaluate neuroanatomic aging in cognitively normal (CN) individuals. BA, however, is a cross-sectional measure that summarizes cumulative neuroanatomic aging since birth. Thus, it conveys poorly recent or contemporaneous aging trends, which can be better quantified by the (temporal) pace P of brain aging. Many approaches to map P, however, rely on quantifying DNA methylation in whole-blood cells, which the blood–brain barrier separates from neural brain cells. We introduce a three-dimensional convolutional neural network (3D-CNN) to estimate P noninvasively from longitudinal MRI. Our longitudinal model (LM) is trained on MRIs from 2,055 CN adults, validated in 1,304 CN adults, and further applied to an independent cohort of 104 CN adults and 140 patients with Alzheimer’s disease (AD). In its test set, the LM computes P with a mean absolute error (MAE) of 0.16 y (7% mean error). This significantly outperforms the most accurate cross-sectional model, whose MAE of 1.85 y has 83% error. By synergizing the LM with an interpretable CNN saliency approach, we map anatomic variations in regional brain aging rates that differ according to sex, decade of life, and neurocognitive status. LM estimates of P are significantly associated with changes in cognitive functioning across domains. This underscores the LM’s ability to estimate P in a way that captures the relationship between neuroanatomic and neurocognitive aging. This research complements existing strategies for AD risk assessment that estimate individuals’ rates of adverse cognitive change with age.
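
For readers unfamiliar with the accuracy metric quoted above, the following Python sketch shows how a mean absolute error (MAE) in years is computed. The values are hypothetical and are not taken from the study. The abstract does not state how its percentage error figures are defined, so only the MAE itself is shown.

```python
def mean_absolute_error(reference, predicted):
    """Average of |predicted - reference| over paired values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


# Hypothetical reference and model-estimated values, in years.
reference_years = [1.0, 2.0, 3.0, 4.0]
predicted_years = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(reference_years, predicted_years)
print(f"MAE = {mae:.2f} y")
```

Because the MAE is expressed in the same unit as the quantity being estimated, values such as 0.16 y and 1.85 y can be compared directly across models evaluated on the same test set.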
Free, publicly-accessible full text available March 11, 2026
